AIbase

# TinyV reward system

TinyV 1.5B
Apache-2.0
Fine-tuned from Qwen/Qwen2.5-1.5B-Instruct using the TinyV reward system, which provides more accurate reward signals during efficient reinforcement learning (RL) post-training, significantly improving both RL efficiency and final model performance.
Large Language Model, Transformers
zhangchenxu
1,124
1
© 2025 AIbase